Goto

Collaborating Authors

 Starland County


Advancing Embodied Intelligence in Robotic-Assisted Endovascular Procedures: A Systematic Review of AI Solutions

Yao, Tianliang, Lu, Bo, Kowarschik, Markus, Yuan, Yixuan, Zhao, Hubin, Ourselin, Sebastien, Althoefer, Kaspar, Ge, Junbo, Qi, Peng

arXiv.org Artificial Intelligence

Endovascular procedures have revolutionized vascular disease treatment, yet their manual execution is challenged by the demands for high precision, operator fatigue, and radiation exposure. Robotic systems have emerged as transformative solutions to mitigate these inherent limitations. A pivotal moment has arrived, where a confluence of pressing clinical needs and breakthroughs in AI creates an opportunity for a paradigm shift toward Embodied Intelligence (EI), enabling robots to navigate complex vascular networks and adapt to dynamic physiological conditions. Data-driven approaches, leveraging advanced computer vision, medical image analysis, and machine learning, drive this evolution by enabling real-time vessel segmentation, device tracking, and anatomical landmark detection. Reinforcement learning and imitation learning further enhance navigation strategies and replicate expert techniques. This review systematically analyzes the integration of EI into endovascular robotics, identifying profound systemic challenges such as the heterogeneity in validation standards and the gap between human mimicry and machine-native capabilities. Based on this analysis, a conceptual roadmap is proposed that reframes the ultimate objective away from systems that supplant clinical decision-making. This vision of augmented intelligence, where the clinician's role evolves into that of a high-level supervisor, provides a principled foundation for the future of the field.


Multi-modal Co-learning for Earth Observation: Enhancing single-modality models via modality collaboration

Mena, Francisco, Ienco, Dino, Dantas, Cassio F., Interdonato, Roberto, Dengel, Andreas

arXiv.org Artificial Intelligence

Multi-modal co-learning is emerging as an effective paradigm in machine learning, enabling models to collaboratively learn from different modalities to enhance single-modality predictions. Earth Observation (EO) represents a quintessential domain for multi-modal data analysis, wherein diverse remote sensors collect data to sense our planet. This unprecedented volume of data introduces novel challenges. Specifically, the access to the same sensor modalities at both training and inference stages becomes increasingly complex based on real-world constraints affecting remote sensing platforms. In this context, multi-modal co-learning presents a promising strategy to leverage the vast amount of sensor-derived data available at the training stage to improve single-modality models for inference-time deployment. Most current research efforts focus on designing customized solutions for either particular downstream tasks or specific modalities available at the inference stage. To address this, we propose a novel multi-modal co-learning framework capable of generalizing across various tasks without targeting a specific modality for inference. Our approach combines contrastive and modality discriminative learning together to guide single-modality models to structure the internal model manifold into modality-shared and modality-specific information. We evaluate our framework on four EO benchmarks spanning classification and regression tasks across different sensor modalities, where only one of the modalities available during training is accessible at inference time. Our results demonstrate consistent predictive improvements over state-of-the-art approaches from the recent machine learning and computer vision literature, as well as EO-specific methods. The obtained findings validate our framework in the single-modality inference scenarios across a diverse range of EO applications.



TCM-5CEval: Extended Deep Evaluation Benchmark for LLM's Comprehensive Clinical Research Competence in Traditional Chinese Medicine

Huang, Tianai, Chen, Jiayuan, Lu, Lu, Chen, Pengcheng, Li, Tianbin, Han, Bing, Tang, Wenchao, Xu, Jie, Li, Ming

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated exceptional capabilities in general domains, yet their application in highly specialized and culturally-rich fields like Traditional Chinese Medicine (TCM) requires rigorous and nuanced evaluation. Building upon prior foundational work such as TCM-3CEval, which highlighted systemic knowledge gaps and the importance of cultural-contextual alignment, we introduce TCM-5CEval, a more granular and comprehensive benchmark. TCM-5CEval is designed to assess LLMs across five critical dimensions: (1) Core Knowledge (TCM-Exam), (2) Classical Literacy (TCM-LitQA), (3) Clinical Decision-making (TCM-MRCD), (4) Chinese Materia Medica (TCM-CMM), and (5) Clinical Non-pharmacological Therapy (TCM-ClinNPT). We conducted a thorough evaluation of fifteen prominent LLMs, revealing significant performance disparities and identifying top-performing models like deepseek\_r1 and gemini\_2\_5\_pro. Our findings show that while models exhibit proficiency in recalling foundational knowledge, they struggle with the interpretative complexities of classical texts. Critically, permutation-based consistency testing reveals widespread fragilities in model inference. All evaluated models, including the highest-scoring ones, displayed a substantial performance degradation when faced with varied question option ordering, indicating a pervasive sensitivity to positional bias and a lack of robust understanding. TCM-5CEval not only provides a more detailed diagnostic tool for LLM capabilities in TCM but aldso exposes fundamental weaknesses in their reasoning stability. To promote further research and standardized comparison, TCM-5CEval has been uploaded to the Medbench platform, joining its predecessor in the "In-depth Challenge for Comprehensive TCM Abilities" special track.



The causal structure of galactic astrophysics

Desmond, Harry, Ramsey, Joseph

arXiv.org Artificial Intelligence

ABSTRACT Data-driven astrophysics currently relies on the detection and characterisation of correlations between objects' properties, which are then used to test physical theories that make predictions for them. This process fails to utilise information in the data that forms a crucial part of the theories' predictions, namely which variables are directly correlated (as opposed to accidentally correlated through others), the directions of these determinations, and the presence or absence of confounders that correlate variables in the dataset but are themselves absent from it. We propose to recover this information through causal discovery, a well-developed methodology for inferring the causal structure of datasets that is however almost entirely unknown to astrophysics. INTRODUCTION Understanding the physical processes that shape galaxies is a central goal of astrophysics. Empirical progress has traditionally relied on identifying correlations between observed properties, which can then be interpreted in light of theoretical models for galaxy formation and used to constrain them. The advent of large surveys and powerful machine learning techniques has greatly expanded our ability to find such statistical associations, uncovering intricate patterns across high-dimensional parameter spaces. However, correlation alone cannot determine causal influences among variables: which properties are actually responsible for determining others, in what direction this influence goes, and whether there exist confounding variables that are not included in the dataset but influence those that are.


Position: Biology is the Challenge Physics-Informed ML Needs to Evolve

Martinelli, Julien

arXiv.org Artificial Intelligence

Physics-Informed Machine Learning (PIML) has successfully integrated mechanistic understanding into machine learning, particularly in domains governed by well-known physical laws. This success has motivated efforts to apply PIML to biology, a field rich in dynamical systems but shaped by different constraints. Biological modeling, however, presents unique challenges: multi-faceted and uncertain prior knowledge, heterogeneous and noisy data, partial observability, and complex, high-dimensional networks. In this position paper, we argue that these challenges should not be seen as obstacles to PIML, but as catalysts for its evolution. We propose Biology-Informed Machine Learning (BIML): a principled extension of PIML that retains its structural grounding while adapting to the practical realities of biology. Rather than replacing PIML, BIML retools its methods to operate under softer, probabilistic forms of prior knowledge. We outline four foundational pillars as a roadmap for this transition: uncertainty quantification, contextualization, constrained latent structure inference, and scalability. Foundation Models and Large Language Models will be key enablers, bridging human expertise with computational modeling. We conclude with concrete recommendations to build the BIML ecosystem and channel PIML-inspired innovation toward challenges of high scientific and societal relevance.


Topological Structure Learning Should Be A Research Priority for LLM-Based Multi-Agent Systems

Yang, Jiaxi, Zhang, Mengqi, Jin, Yiqiao, Chen, Hao, Wen, Qingsong, Lin, Lu, He, Yi, Kumar, Srijan, Xu, Weijie, Evans, James, Wang, Jindong

arXiv.org Artificial Intelligence

Large Language Model-based Multi-Agent Systems (MASs) have emerged as a powerful paradigm for tackling complex tasks through collaborative intelligence. However, the topology of these systems--how agents in MASs should be configured, connected, and coordinated--remains largely unexplored. In this position paper, we call for a paradigm shift toward \emph{topology-aware MASs} that explicitly model and dynamically optimize the structure of inter-agent interactions. We identify three fundamental components--agents, communication links, and overall topology--that collectively determine the system's adaptability, efficiency, robustness, and fairness. To operationalize this vision, we introduce a systematic three-stage framework: 1) agent selection, 2) structure profiling, and 3) topology synthesis. This framework not only provides a principled foundation for designing MASs but also opens new research frontiers across language modeling, reinforcement learning, graph learning, and generative modeling to ultimately unleash their full potential in complex real-world applications. We conclude by outlining key challenges and opportunities in MASs evaluation. We hope our framework and perspectives offer critical new insights in the era of agentic AI.


A Comprehensive Review of Diffusion Models in Smart Agriculture: Progress, Applications, and Challenges

Hu, Xing, Chen, Haodong, Duan, Qianqian, Zhang, Dawei

arXiv.org Artificial Intelligence

With the global population increasing and arable land resources becoming increasingly limited, smart and precision agriculture have emerged as essential directions for sustainable agricultural development. Artificial intelligence (AI), particularly deep learning models, has been widely adopted in applications such as crop monitoring, pest detection, and yield prediction. Among recent generative models, diffusion models have demonstrated considerable potential in agricultural image processing, data augmentation, and remote sensing analysis. Compared to traditional generative adversarial networks (GANs), diffusion models exhibit greater training stability and superior image generation quality, effectively addressing challenges such as limited annotated datasets and imbalanced sample distributions in agricultural scenarios. This paper reviews recent advancements in the application of diffusion models within agriculture, focusing on their roles in crop disease and pest detection, remote sensing image enhancement, crop growth prediction, and agricultural resource management. Diffusion models have been found useful in improving tasks like image generation, denoising, and data augmentation in agriculture, especially when environmental noise or variability is present. While their high computational requirements and limited generalizability across domains remain concerns, the approach is gradually proving effective in real-world applications such as precision crop monitoring. As research progresses, these models may help support sustainable agriculture and address emerging challenges in food systems.